Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 603416 |
| Missing cells | 344234 |
| Missing cells (%) | 3.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 92.1 MiB |
| Average record size in memory | 160.0 B |
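The two derived figures in this table can be rechecked from the raw counts. The sketch below is only a sanity check, assuming that "Missing cells (%)" is taken over all cells (variables × observations) and that MiB means 1024² bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 603_416
missing_cells = 344_234
total_size_mib = 92.1

# Share of missing cells over the whole table (assumed denominator).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row, assuming 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.0f} B")
```

Both results round to the values reported above (3.0% and 160 B).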
Variable types
| Categorical | 8 |
|---|---|
| Numeric | 11 |
Warnings
| Warning | Type |
|---|---|
| DBN has a high cardinality: 1631 distinct values | High cardinality |
| School Name has a high cardinality: 1627 distinct values | High cardinality |
| % Attendance is highly correlated with % Attendance_5_yr_avg | High correlation |
| % Chronically Absent is highly correlated with % Chronically Absent_5_yr_avg | High correlation |
| Next Year % Chronically Absent is highly correlated with % Chronically Absent_5_yr_avg | High correlation |
| % Attendance_2_yr_avg is highly correlated with % Attendance_5_yr_avg | High correlation |
| % Chronically Absent_2_yr_avg is highly correlated with % Chronically Absent_5_yr_avg | High correlation |
| % Attendance_5_yr_avg is highly correlated with % Attendance and 1 other field | High correlation |
| % Chronically Absent_5_yr_avg is highly correlated with % Chronically Absent and 2 other fields | High correlation |
| Borough_Name is highly correlated with Borough_Code | High correlation |
| Borough_Code is highly correlated with Borough_Name | High correlation |
| Next Year % Chronically Absent has 149460 (24.8%) missing values | Missing |
| Chronically_Absent_Next_Year has 149460 (24.8%) missing values | Missing |
| % Attendance_2_yr_avg has 20151 (3.3%) missing values | Missing |
| % Chronically Absent_2_yr_avg has 20151 (3.3%) missing values | Missing |
| % Chronically Absent has 18887 (3.1%) zeros | Zeros |
| Next Year % Chronically Absent has 11965 (2.0%) zeros | Zeros |
| % Chronically Absent_2_yr_avg has 7352 (1.2%) zeros | Zeros |
Reproduction
| Analysis started | 2021-01-11 23:57:47.464896 |
|---|---|
| Analysis finished | 2021-01-12 00:03:38.527703 |
| Duration | 5 minutes and 51.06 seconds |
| Software version | pandas-profiling v2.10.0 |
| Download configuration | config.yaml |
DBN
Categorical
| Distinct | 1631 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 MiB |
| 31R080 | 945 |
|---|---|
| 01M539 | 885 |
| 75Q993 | 797 |
| 21K095 | 763 |
| 30Q122 | 760 |
| Other values (1626) |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 3620496 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 01M015 |
|---|---|
| 2nd row | 01M015 |
| 3rd row | 01M015 |
| 4th row | 01M015 |
| 5th row | 01M015 |
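Every DBN in this report is six characters long, and the character counts further down show five digits and one uppercase letter per row. A minimal sketch of splitting such a code follows. It assumes (the report does not say so) that the layout is two district digits, one borough letter, and three school-number digits, which would line up with the District_Number, Borough_Code and School_Number variables profiled later. `parse_dbn` and the `DBN` tuple are hypothetical names.

```python
import re
from typing import NamedTuple


class DBN(NamedTuple):
    district: int
    borough: str
    school: int


# Assumed layout: 2 digits, one of the five borough letters seen in
# this report (K, M, Q, R, X), then 3 digits.
_DBN_PATTERN = re.compile(r"(\d{2})([KMQRX])(\d{3})")


def parse_dbn(code: str) -> DBN:
    """Split a six-character DBN into its assumed components."""
    match = _DBN_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a DBN-shaped string: {code!r}")
    district, borough, school = match.groups()
    return DBN(int(district), borough, int(school))


print(parse_dbn("01M015"))
print(parse_dbn("75Q993"))
```

Raising on anything that does not match keeps malformed codes from silently turning into wrong district or school numbers.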
| Value | Count | Frequency (%) |
| 31R080 | 945 | 0.2% |
| 01M539 | 885 | 0.1% |
| 75Q993 | 797 | 0.1% |
| 21K095 | 763 | 0.1% |
| 30Q122 | 760 | 0.1% |
| 25Q219 | 748 | 0.1% |
| 22K206 | 747 | 0.1% |
| 20K104 | 742 | 0.1% |
| 21K225 | 740 | 0.1% |
| 21K226 | 736 | 0.1% |
| Other values (1621) | 595553 |
| Value | Count | Frequency (%) |
| 31r080 | 945 | 0.2% |
| 01m539 | 885 | 0.1% |
| 75q993 | 797 | 0.1% |
| 21k095 | 763 | 0.1% |
| 30q122 | 760 | 0.1% |
| 25q219 | 748 | 0.1% |
| 22k206 | 747 | 0.1% |
| 20k104 | 742 | 0.1% |
| 21k225 | 740 | 0.1% |
| 21k226 | 736 | 0.1% |
| Other values (1621) | 595553 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 531023 | |
| 1 | 519950 | |
| 2 | 494732 | |
| 3 | 283136 | |
| 5 | 245159 | |
| 4 | 222205 | 6.1% |
| 7 | 194560 | 5.4% |
| 9 | 179402 | 5.0% |
| 6 | 179191 | 4.9% |
| K | 178769 | 4.9% |
| Other values (5) | 592369 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3017080 | |
| Uppercase Letter | 603416 | 16.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 531023 | |
| 1 | 519950 | |
| 2 | 494732 | |
| 3 | 283136 | |
| 5 | 245159 | |
| 4 | 222205 | |
| 7 | 194560 | 6.4% |
| 9 | 179402 | 5.9% |
| 6 | 179191 | 5.9% |
| 8 | 167722 | 5.6% |
| Value | Count | Frequency (%) |
| K | 178769 | |
| Q | 148530 | |
| X | 133381 | |
| M | 110824 | |
| R | 31912 | 5.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3017080 | |
| Latin | 603416 | 16.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 531023 | |
| 1 | 519950 | |
| 2 | 494732 | |
| 3 | 283136 | |
| 5 | 245159 | |
| 4 | 222205 | |
| 7 | 194560 | 6.4% |
| 9 | 179402 | 5.9% |
| 6 | 179191 | 5.9% |
| 8 | 167722 | 5.6% |
| Value | Count | Frequency (%) |
| K | 178769 | |
| Q | 148530 | |
| X | 133381 | |
| M | 110824 | |
| R | 31912 | 5.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3620496 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 531023 | |
| 1 | 519950 | |
| 2 | 494732 | |
| 3 | 283136 | |
| 5 | 245159 | |
| 4 | 222205 | 6.1% |
| 7 | 194560 | 5.4% |
| 9 | 179402 | 5.0% |
| 6 | 179191 | 4.9% |
| K | 178769 | 4.9% |
| Other values (5) | 592369 |
School Name
Categorical
| Distinct | 1627 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 MiB |
| P.S. 212 | 1000 |
|---|---|
| P.S. 253 | 950 |
| The Michael J. Petrides School | 945 |
| New Explorations into Science, Technology and Math | 885 |
| P.S. Q993 | 797 |
| Other values (1622) |
Length
| Max length | 50 |
|---|---|
| Median length | 26 |
| Mean length | 27.2181563 |
| Min length | 5 |
Characters and Unicode
| Total characters | 16423871 |
|---|---|
| Distinct characters | 74 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | P.S. 015 Roberto Clemente |
|---|---|
| 2nd row | P.S. 015 Roberto Clemente |
| 3rd row | P.S. 015 Roberto Clemente |
| 4th row | P.S. 015 Roberto Clemente |
| 5th row | P.S. 015 Roberto Clemente |
| Value | Count | Frequency (%) |
| P.S. 212 | 1000 | 0.2% |
| P.S. 253 | 950 | 0.2% |
| The Michael J. Petrides School | 945 | 0.2% |
| New Explorations into Science, Technology and Math | 885 | 0.1% |
| P.S. Q993 | 797 | 0.1% |
| P.S. 095 The Gravesend | 763 | 0.1% |
| P.S. 122 Mamie Fay | 760 | 0.1% |
| P.S. 219 Paul Klapper | 748 | 0.1% |
| P.S. 206 Joseph F Lamb | 747 | 0.1% |
| P.S./I.S. 104 The Fort Hamilton School | 742 | 0.1% |
| Other values (1617) | 595079 |
| Value | Count | Frequency (%) |
| p.s | 315877 | 11.5% |
| school | 212281 | 7.7% |
| the | 85246 | 3.1% |
| high | 72197 | 2.6% |
| for | 58116 | 2.1% |
| academy | 56152 | 2.0% |
| and | 40069 | 1.5% |
| of | 37497 | 1.4% |
| | 28244 | 1.0% |
| bronx | 19556 | 0.7% |
| Other values (2047) | 1826951 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2149195 | 13.1% |
| o | 1094775 | 6.7% |
| e | 1072554 | 6.5% |
| . | 890224 | 5.4% |
| a | 803404 | 4.9% |
| S | 729290 | 4.4% |
| r | 713156 | 4.3% |
| l | 707878 | 4.3% |
| n | 705911 | 4.3% |
| i | 611049 | 3.7% |
| Other values (64) | 6946435 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 9482943 | |
| Uppercase Letter | 2697399 | 16.4% |
| Space Separator | 2149195 | 13.1% |
| Decimal Number | 1110749 | 6.8% |
| Other Punctuation | 957395 | 5.8% |
| Dash Punctuation | 21341 | 0.1% |
| Open Punctuation | 3062 | < 0.1% |
| Close Punctuation | 1787 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| S | 729290 | |
| P | 409824 | |
| H | 166579 | 6.2% |
| A | 163245 | 6.1% |
| C | 136037 | 5.0% |
| M | 130394 | 4.8% |
| T | 130253 | 4.8% |
| B | 105152 | 3.9% |
| E | 97566 | 3.6% |
| L | 80184 | 3.0% |
| Other values (16) | 548875 |
| Value | Count | Frequency (%) |
| o | 1094775 | |
| e | 1072554 | |
| a | 803404 | 8.5% |
| r | 713156 | 7.5% |
| l | 707878 | 7.5% |
| n | 705911 | 7.4% |
| i | 611049 | 6.4% |
| h | 597790 | 6.3% |
| c | 503572 | 5.3% |
| t | 455917 | 4.8% |
| Other values (16) | 2216937 |
| Value | Count | Frequency (%) |
| 1 | 211253 | |
| 0 | 210355 | |
| 2 | 145962 | |
| 3 | 100034 | |
| 5 | 77196 | 6.9% |
| 9 | 76649 | 6.9% |
| 4 | 75575 | 6.8% |
| 6 | 74733 | 6.7% |
| 8 | 72821 | 6.6% |
| 7 | 66171 | 6.0% |
| Value | Count | Frequency (%) |
| . | 890224 | |
| / | 24246 | 2.5% |
| , | 20117 | 2.1% |
| & | 9555 | 1.0% |
| ' | 6826 | 0.7% |
| : | 6077 | 0.6% |
| \ | 276 | < 0.1% |
| @ | 74 | < 0.1% |
| Value | Count | Frequency (%) |
| (space) | 2149195 |
| Value | Count | Frequency (%) |
| - | 21341 |
| Value | Count | Frequency (%) |
| ( | 3062 |
| Value | Count | Frequency (%) |
| ) | 1787 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 12180342 | |
| Common | 4243529 | 25.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| o | 1094775 | 9.0% |
| e | 1072554 | 8.8% |
| a | 803404 | 6.6% |
| S | 729290 | 6.0% |
| r | 713156 | 5.9% |
| l | 707878 | 5.8% |
| n | 705911 | 5.8% |
| i | 611049 | 5.0% |
| h | 597790 | 4.9% |
| c | 503572 | 4.1% |
| Other values (42) | 4640963 |
| Value | Count | Frequency (%) |
| (space) | 2149195 | |
| . | 890224 | |
| 1 | 211253 | 5.0% |
| 0 | 210355 | 5.0% |
| 2 | 145962 | 3.4% |
| 3 | 100034 | 2.4% |
| 5 | 77196 | 1.8% |
| 9 | 76649 | 1.8% |
| 4 | 75575 | 1.8% |
| 6 | 74733 | 1.8% |
| Other values (12) | 232353 | 5.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16423871 |
Most frequent character per block
| Value | Count | Frequency (%) |
| (space) | 2149195 | 13.1% |
| o | 1094775 | 6.7% |
| e | 1072554 | 6.5% |
| . | 890224 | 5.4% |
| a | 803404 | 4.9% |
| S | 729290 | 4.4% |
| r | 713156 | 4.3% |
| l | 707878 | 4.3% |
| n | 705911 | 4.3% |
| i | 611049 | 3.7% |
| Other values (64) | 6946435 |
Grade
Categorical
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 MiB |
| All Grades | |
|---|---|
| 1 | |
| 2 | |
| 0K | |
| 3 | |
| Other values (10) |
Length
| Max length | 18 |
|---|---|
| Median length | 1 |
| Mean length | 3.494933843 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2108899 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | All Grades |
|---|---|
| 2nd row | PK in K-12 Schools |
| 3rd row | 0K |
| 4th row | 1 |
| 5th row | 2 |
| Value | Count | Frequency (%) |
| All Grades | 117285 | |
| 1 | 47545 | 7.9% |
| 2 | 47314 | 7.8% |
| 0K | 46364 | 7.7% |
| 3 | 46160 | 7.6% |
| 4 | 45008 | 7.5% |
| 5 | 43816 | 7.3% |
| 6 | 28560 | 4.7% |
| 8 | 28508 | 4.7% |
| 9 | 28460 | 4.7% |
| Other values (5) | 124396 |
| Value | Count | Frequency (%) |
| all | 117285 | |
| grades | 117285 | |
| 1 | 47545 | 6.1% |
| 2 | 47314 | 6.1% |
| 0k | 46364 | 6.0% |
| 3 | 46160 | 5.9% |
| 4 | 45008 | 5.8% |
| 5 | 43816 | 5.6% |
| 6 | 28560 | 3.7% |
| 8 | 28508 | 3.7% |
| Other values (9) | 210465 |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 253773 | 12.0% |
| (space) | 174894 | 8.3% |
| 1 | 168664 | 8.0% |
| s | 136488 | 6.5% |
| A | 117285 | 5.6% |
| G | 117285 | 5.6% |
| r | 117285 | 5.6% |
| a | 117285 | 5.6% |
| d | 117285 | 5.6% |
| e | 117285 | 5.6% |
| Other values (18) | 671370 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 974619 | |
| Decimal Number | 582437 | |
| Uppercase Letter | 357746 | 17.0% |
| Space Separator | 174894 | 8.3% |
| Dash Punctuation | 19203 | 0.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| l | 253773 | |
| s | 136488 | |
| r | 117285 | |
| a | 117285 | |
| d | 117285 | |
| e | 117285 | |
| o | 38406 | 3.9% |
| i | 19203 | 2.0% |
| n | 19203 | 2.0% |
| c | 19203 | 2.0% |
| Value | Count | Frequency (%) |
| 1 | 168664 | |
| 2 | 91391 | |
| 0 | 73780 | |
| 3 | 46160 | 7.9% |
| 4 | 45008 | 7.7% |
| 5 | 43816 | 7.5% |
| 6 | 28560 | 4.9% |
| 8 | 28508 | 4.9% |
| 9 | 28460 | 4.9% |
| 7 | 28090 | 4.8% |
| Value | Count | Frequency (%) |
| A | 117285 | |
| G | 117285 | |
| K | 84770 | |
| P | 19203 | 5.4% |
| S | 19203 | 5.4% |
| Value | Count | Frequency (%) |
| (space) | 174894 |
| Value | Count | Frequency (%) |
| - | 19203 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1332365 | |
| Common | 776534 |
Most frequent character per script
| Value | Count | Frequency (%) |
| l | 253773 | |
| s | 136488 | |
| A | 117285 | |
| G | 117285 | |
| r | 117285 | |
| a | 117285 | |
| d | 117285 | |
| e | 117285 | |
| K | 84770 | 6.4% |
| o | 38406 | 2.9% |
| Other values (6) | 115218 |
| Value | Count | Frequency (%) |
| (space) | 174894 | |
| 1 | 168664 | |
| 2 | 91391 | |
| 0 | 73780 | |
| 3 | 46160 | 5.9% |
| 4 | 45008 | 5.8% |
| 5 | 43816 | 5.6% |
| 6 | 28560 | 3.7% |
| 8 | 28508 | 3.7% |
| 9 | 28460 | 3.7% |
| Other values (2) | 47293 | 6.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2108899 |
Most frequent character per block
| Value | Count | Frequency (%) |
| l | 253773 | 12.0% |
| (space) | 174894 | 8.3% |
| 1 | 168664 | 8.0% |
| s | 136488 | 6.5% |
| A | 117285 | 5.6% |
| G | 117285 | 5.6% |
| r | 117285 | 5.6% |
| a | 117285 | 5.6% |
| d | 117285 | 5.6% |
| e | 117285 | 5.6% |
| Other values (18) | 671370 |
Year
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 MiB |
| 2018-19 | |
|---|---|
| 2017-18 | |
| 2016-17 | |
| 2015-16 | |
| 2014-15 |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Characters and Unicode
| Total characters | 4223912 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2013-14 |
|---|---|
| 2nd row | 2013-14 |
| 3rd row | 2013-14 |
| 4th row | 2013-14 |
| 5th row | 2013-14 |
| Value | Count | Frequency (%) |
| 2018-19 | 102645 | |
| 2017-18 | 102496 | |
| 2016-17 | 102411 | |
| 2015-16 | 100911 | |
| 2014-15 | 98739 | |
| 2013-14 | 96214 |
| Value | Count | Frequency (%) |
| 2018-19 | 102645 | |
| 2017-18 | 102496 | |
| 2016-17 | 102411 | |
| 2015-16 | 100911 | |
| 2014-15 | 98739 | |
| 2013-14 | 96214 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 1206832 | |
| 2 | 603416 | |
| 0 | 603416 | |
| - | 603416 | |
| 8 | 205141 | 4.9% |
| 7 | 204907 | 4.9% |
| 6 | 203322 | 4.8% |
| 5 | 199650 | 4.7% |
| 4 | 194953 | 4.6% |
| 9 | 102645 | 2.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3620496 | |
| Dash Punctuation | 603416 | 14.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 1 | 1206832 | |
| 2 | 603416 | |
| 0 | 603416 | |
| 8 | 205141 | 5.7% |
| 7 | 204907 | 5.7% |
| 6 | 203322 | 5.6% |
| 5 | 199650 | 5.5% |
| 4 | 194953 | 5.4% |
| 9 | 102645 | 2.8% |
| 3 | 96214 | 2.7% |
| Value | Count | Frequency (%) |
| - | 603416 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4223912 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 1 | 1206832 | |
| 2 | 603416 | |
| 0 | 603416 | |
| - | 603416 | |
| 8 | 205141 | 4.9% |
| 7 | 204907 | 4.9% |
| 6 | 203322 | 4.8% |
| 5 | 199650 | 4.7% |
| 4 | 194953 | 4.6% |
| 9 | 102645 | 2.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4223912 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 1 | 1206832 | |
| 2 | 603416 | |
| 0 | 603416 | |
| - | 603416 | |
| 8 | 205141 | 4.9% |
| 7 | 204907 | 4.9% |
| 6 | 203322 | 4.8% |
| 5 | 199650 | 4.7% |
| 4 | 194953 | 4.6% |
| 9 | 102645 | 2.4% |
Demographic Variable
Categorical
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 MiB |
| All Students | |
|---|---|
| Female | |
| Male | |
| Hispanic | |
| SWD | |
| Other values (9) |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 6.603374786 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3984582 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | All Students |
|---|---|
| 2nd row | All Students |
| 3rd row | All Students |
| 4th row | All Students |
| 5th row | All Students |
| Value | Count | Frequency (%) |
| All Students | 61409 | |
| Female | 58917 | |
| Male | 58760 | |
| Hispanic | 52350 | |
| SWD | 52001 | |
| Not SWD | 50016 | |
| Poverty | 47929 | |
| Not Poverty | 46919 | |
| Not ELL | 40463 | |
| Black | 40209 | |
| Other values (4) | 94443 |
| Value | Count | Frequency (%) |
| not | 137398 | |
| swd | 102017 | |
| poverty | 94848 | |
| ell | 77411 | |
| students | 61409 | |
| all | 61409 | |
| female | 58917 | |
| male | 58760 | |
| hispanic | 52350 | 6.5% |
| black | 40209 | 5.0% |
| Other values (3) | 57495 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 387987 | 9.7% |
| e | 365774 | 9.2% |
| l | 280704 | 7.0% |
| a | 234808 | 5.9% |
| o | 232246 | 5.8% |
| (space) | 198807 | 5.0% |
| S | 163426 | 4.1% |
| L | 154822 | 3.9% |
| i | 153569 | 3.9% |
| n | 138331 | 3.5% |
| Other values (22) | 1674108 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2624696 | |
| Uppercase Letter | 1161079 | |
| Space Separator | 198807 | 5.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| t | 387987 | |
| e | 365774 | |
| l | 280704 | |
| a | 234808 | |
| o | 232246 | |
| i | 153569 | 5.9% |
| n | 138331 | 5.3% |
| s | 138331 | 5.3% |
| r | 103474 | 3.9% |
| v | 94848 | 3.6% |
| Other values (8) | 494624 |
| Value | Count | Frequency (%) |
| S | 163426 | |
| L | 154822 | |
| N | 137398 | |
| W | 126314 | |
| D | 102017 | |
| P | 94848 | |
| A | 85981 | |
| E | 77411 | |
| F | 58917 | 5.1% |
| M | 58760 | 5.1% |
| Other values (3) | 101185 |
| Value | Count | Frequency (%) |
| (space) | 198807 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3785775 | |
| Common | 198807 | 5.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| t | 387987 | 10.2% |
| e | 365774 | 9.7% |
| l | 280704 | 7.4% |
| a | 234808 | 6.2% |
| o | 232246 | 6.1% |
| S | 163426 | 4.3% |
| L | 154822 | 4.1% |
| i | 153569 | 4.1% |
| n | 138331 | 3.7% |
| s | 138331 | 3.7% |
| Other values (21) | 1535777 |
| Value | Count | Frequency (%) |
| (space) | 198807 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3984582 |
Most frequent character per block
| Value | Count | Frequency (%) |
| t | 387987 | 9.7% |
| e | 365774 | 9.2% |
| l | 280704 | 7.0% |
| a | 234808 | 5.9% |
| o | 232246 | 5.8% |
| (space) | 198807 | 5.0% |
| S | 163426 | 4.1% |
| L | 154822 | 3.9% |
| i | 153569 | 3.9% |
| n | 138331 | 3.5% |
| Other values (22) | 1674108 |
# Days Absent
Real number (ℝ≥0)
| Distinct | 16793 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1495.26026 |
|---|---|
| Minimum | 0 |
| Maximum | 105055 |
| Zeros | 7 |
| Zeros (%) | < 0.1% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 97 |
| Q1 | 299 |
| median | 647 |
| Q3 | 1425 |
| 95-th percentile | 6026 |
| Maximum | 105055 |
| Range | 105055 |
| Interquartile range (IQR) | 1126 |
Descriptive statistics
| Standard deviation | 2873.307826 |
|---|---|
| Coefficient of variation (CV) | 1.921610507 |
| Kurtosis | 117.776258 |
| Mean | 1495.26026 |
| Median Absolute Deviation (MAD) | 430 |
| Skewness | 7.939262911 |
| Sum | 902263965 |
| Variance | 8255897.862 |
| Monotonicity | Not monotonic |
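Several of the descriptive statistics are simple functions of the others, so they can be cross-checked by hand. The sketch below recomputes them for # Days Absent from the figures in the tables above. It is a worked check under the usual definitions (CV = standard deviation / mean, IQR = Q3 − Q1, variance = standard deviation squared), not a description of how the profiling tool computes them internally.

```python
import math

# Figures copied from the "# Days Absent" tables.
n_observations = 603_416
total = 902_263_965
mean = 1495.26026
std = 2873.307826
q1, q3 = 299, 1425
minimum, maximum = 0, 105_055

cv = std / mean                  # coefficient of variation
variance = std ** 2
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum
mean_from_sum = total / n_observations  # no missing values, so n is the full row count

print(f"CV = {cv:.6f}, variance = {variance:.3f}")
print(f"IQR = {iqr}, range = {value_range}")
print(f"mean from sum = {mean_from_sum:.5f}")

# The recomputed values agree with the reported ones up to rounding.
assert math.isclose(cv, 1.921610507, rel_tol=1e-7)
assert math.isclose(variance, 8255897.862, rel_tol=1e-7)
assert math.isclose(mean_from_sum, mean, rel_tol=1e-7)
```

A CV close to 2, combined with a mean more than twice the median (1495 against 647), is consistent with the strong right skew reported for this variable.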
| Value | Count | Frequency (%) |
| 158 | 685 | 0.1% |
| 126 | 680 | 0.1% |
| 121 | 664 | 0.1% |
| 115 | 661 | 0.1% |
| 141 | 658 | 0.1% |
| 185 | 657 | 0.1% |
| 174 | 653 | 0.1% |
| 227 | 649 | 0.1% |
| 201 | 649 | 0.1% |
| 156 | 648 | 0.1% |
| Other values (16783) | 596812 |
| Value | Count | Frequency (%) |
| 0 | 7 | |
| 2 | 1 | < 0.1% |
| 3 | 5 | |
| 4 | 4 | |
| 5 | 5 |
| Value | Count | Frequency (%) |
| 105055 | 1 | |
| 101087 | 1 | |
| 86903 | 1 | |
| 86895 | 1 | |
| 86513 | 1 |
# Days Present
Real number (ℝ≥0)
| Distinct | 80995 |
|---|---|
| Distinct (%) | 13.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16902.32221 |
|---|---|
| Minimum | 8 |
| Maximum | 934266 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 1362 |
| Q1 | 3680 |
| median | 8017 |
| Q3 | 16040.25 |
| 95-th percentile | 67067 |
| Maximum | 934266 |
| Range | 934258 |
| Interquartile range (IQR) | 12360.25 |
Descriptive statistics
| Standard deviation | 30097.53948 |
|---|---|
| Coefficient of variation (CV) | 1.780674815 |
| Kurtosis | 76.22062149 |
| Mean | 16902.32221 |
| Median Absolute Deviation (MAD) | 5144 |
| Skewness | 6.426763123 |
| Sum | 1.019913166 × 10^10 |
| Variance | 905861882.8 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1020 | 91 | < 0.1% |
| 997 | 89 | < 0.1% |
| 1525 | 87 | < 0.1% |
| 1209 | 84 | < 0.1% |
| 1032 | 84 | < 0.1% |
| 1039 | 84 | < 0.1% |
| 1013 | 82 | < 0.1% |
| 1198 | 82 | < 0.1% |
| 1184 | 82 | < 0.1% |
| 1366 | 81 | < 0.1% |
| Other values (80985) | 602570 |
| Value | Count | Frequency (%) |
| 8 | 1 | |
| 21 | 1 | |
| 95 | 1 | |
| 132 | 1 | |
| 159 | 1 |
| Value | Count | Frequency (%) |
| 934266 | 1 | |
| 922543 | 1 | |
| 915010 | 2 | |
| 904750 | 1 | |
| 886967 | 1 |
% Attendance
Real number (ℝ≥0)
| Distinct | 665 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 91.29376632 |
|---|---|
| Minimum | 0.7 |
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 0.7 |
|---|---|
| 5-th percentile | 81.1 |
| Q1 | 89.4 |
| median | 92.6 |
| Q3 | 94.7 |
| 95-th percentile | 96.9 |
| Maximum | 100 |
| Range | 99.3 |
| Interquartile range (IQR) | 5.3 |
Descriptive statistics
| Standard deviation | 5.341073645 |
|---|---|
| Coefficient of variation (CV) | 0.05850425346 |
| Kurtosis | 10.96648497 |
| Mean | 91.29376632 |
| Median Absolute Deviation (MAD) | 2.5 |
| Skewness | -2.352002407 |
| Sum | 55088119.3 |
| Variance | 28.52706768 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 94.1 | 7549 | 1.3% |
| 94.3 | 7362 | 1.2% |
| 93.9 | 7359 | 1.2% |
| 94.5 | 7344 | 1.2% |
| 94 | 7330 | 1.2% |
| 93.8 | 7325 | 1.2% |
| 94.6 | 7300 | 1.2% |
| 94.2 | 7278 | 1.2% |
| 93.6 | 7241 | 1.2% |
| 94.4 | 7235 | 1.2% |
| Other values (655) | 530093 |
| Value | Count | Frequency (%) |
| 0.7 | 1 | |
| 1.7 | 1 | |
| 9.9 | 1 | |
| 12 | 1 | |
| 12.1 | 1 |
| Value | Count | Frequency (%) |
| 100 | 8 | |
| 99.9 | 3 | < 0.1% |
| 99.8 | 9 | |
| 99.7 | 14 | |
| 99.6 | 19 |
% Chronically Absent
Real number (ℝ≥0)
| Distinct | 984 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.72095188 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 18887 |
| Zeros (%) | 3.1% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3.1 |
| Q1 | 13.5 |
| median | 25 |
| Q3 | 39.4 |
| 95-th percentile | 61.3 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 25.9 |
Descriptive statistics
| Standard deviation | 18.06207498 |
|---|---|
| Coefficient of variation (CV) | 0.6515676324 |
| Kurtosis | 0.04975180387 |
| Mean | 27.72095188 |
| Median Absolute Deviation (MAD) | 12.5 |
| Skewness | 0.6709151033 |
| Sum | 16727265.9 |
| Variance | 326.2385527 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 18887 | 3.1% |
| 33.3 | 10646 | 1.8% |
| 50 | 9534 | 1.6% |
| 25 | 8414 | 1.4% |
| 16.7 | 7247 | 1.2% |
| 20 | 6648 | 1.1% |
| 14.3 | 6222 | 1.0% |
| 28.6 | 5454 | 0.9% |
| 12.5 | 5211 | 0.9% |
| 40 | 4670 | 0.8% |
| Other values (974) | 520483 |
| Value | Count | Frequency (%) |
| 0 | 18887 | |
| 0.2 | 1 | < 0.1% |
| 0.3 | 11 | < 0.1% |
| 0.4 | 41 | < 0.1% |
| 0.5 | 44 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 336 | |
| 99 | 1 | < 0.1% |
| 98.9 | 1 | < 0.1% |
| 98.6 | 1 | < 0.1% |
| 98.5 | 2 | < 0.1% |
Next Year % Chronically Absent
Real number (ℝ≥0)
| Distinct | 972 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 149460 |
| Missing (%) | 24.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.37607984 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 11965 |
| Zeros (%) | 2.0% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3.4 |
| Q1 | 13.3 |
| median | 25 |
| Q3 | 38.9 |
| 95-th percentile | 60 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 25.6 |
Descriptive statistics
| Standard deviation | 17.70309592 |
|---|---|
| Coefficient of variation (CV) | 0.6466629271 |
| Kurtosis | 0.03709800431 |
| Mean | 27.37607984 |
| Median Absolute Deviation (MAD) | 12.5 |
| Skewness | 0.6705077007 |
| Sum | 12427535.7 |
| Variance | 313.3996053 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 11965 | 2.0% |
| 33.3 | 7139 | 1.2% |
| 50 | 6281 | 1.0% |
| 25 | 5911 | 1.0% |
| 20 | 4804 | 0.8% |
| 16.7 | 4622 | 0.8% |
| 14.3 | 4048 | 0.7% |
| 28.6 | 3574 | 0.6% |
| 12.5 | 3532 | 0.6% |
| 40 | 3360 | 0.6% |
| Other values (962) | 398720 | |
| (Missing) | 149460 | 24.8% |
| Value | Count | Frequency (%) |
| 0 | 11965 | |
| 0.2 | 1 | < 0.1% |
| 0.3 | 6 | < 0.1% |
| 0.4 | 39 | < 0.1% |
| 0.5 | 35 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 148 | |
| 98.6 | 1 | < 0.1% |
| 98.5 | 1 | < 0.1% |
| 98.3 | 1 | < 0.1% |
| 98.1 | 1 | < 0.1% |
Chronically_Absent_Next_Year
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 149460 |
| Missing (%) | 24.8% |
| Memory size | 9.2 MiB |
| Medium | |
|---|---|
| High | |
| Low |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.165637198 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2344972 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Medium |
|---|---|
| 2nd row | High |
| 3rd row | Medium |
| 4th row | Medium |
| 5th row | Medium |
| Value | Count | Frequency (%) |
| Medium | 302151 | |
| High | 76651 | 12.7% |
| Low | 75154 | 12.5% |
| (Missing) | 149460 |
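The percentages for this variable depend on the denominator: the table above divides by all rows, while the word-frequency table that follows divides by non-missing rows only. The sketch below reproduces both from the counts in this report. The helper name `pct` is illustrative, and the denominators are inferred from the reported figures rather than documented here.

```python
# Counts copied from the table above.
total_rows = 603_416
missing = 149_460
counts = {"Medium": 302_151, "High": 76_651, "Low": 75_154}

non_missing = total_rows - missing


def pct(count: int, denominator: int) -> float:
    """Percentage rounded to one decimal, as shown in the report."""
    return round(100 * count / denominator, 1)


# Category counts plus missing rows should account for every observation.
assert sum(counts.values()) + missing == total_rows

for label, count in counts.items():
    print(label, pct(count, total_rows), pct(count, non_missing))
```

With all rows as the denominator, High and Low come out at 12.7% and 12.5%; with non-missing rows only, they come out at 16.9% and 16.6%, matching the two tables.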
| Value | Count | Frequency (%) |
| medium | 302151 | |
| high | 76651 | 16.9% |
| low | 75154 | 16.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 378802 | |
| M | 302151 | |
| e | 302151 | |
| d | 302151 | |
| u | 302151 | |
| m | 302151 | |
| H | 76651 | 3.3% |
| g | 76651 | 3.3% |
| h | 76651 | 3.3% |
| L | 75154 | 3.2% |
| Other values (2) | 150308 | 6.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1891016 | |
| Uppercase Letter | 453956 | 19.4% |
Most frequent character per category
| Value | Count | Frequency (%) |
| i | 378802 | |
| e | 302151 | |
| d | 302151 | |
| u | 302151 | |
| m | 302151 | |
| g | 76651 | 4.1% |
| h | 76651 | 4.1% |
| o | 75154 | 4.0% |
| w | 75154 | 4.0% |
| Value | Count | Frequency (%) |
| M | 302151 | |
| H | 76651 | 16.9% |
| L | 75154 | 16.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2344972 |
Most frequent character per script
| Value | Count | Frequency (%) |
| i | 378802 | |
| M | 302151 | |
| e | 302151 | |
| d | 302151 | |
| u | 302151 | |
| m | 302151 | |
| H | 76651 | 3.3% |
| g | 76651 | 3.3% |
| h | 76651 | 3.3% |
| L | 75154 | 3.2% |
| Other values (2) | 150308 | 6.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2344972 |
Most frequent character per block
| Value | Count | Frequency (%) |
| i | 378802 | |
| M | 302151 | |
| e | 302151 | |
| d | 302151 | |
| u | 302151 | |
| m | 302151 | |
| H | 76651 | 3.3% |
| g | 76651 | 3.3% |
| h | 76651 | 3.3% |
| L | 75154 | 3.2% |
| Other values (2) | 150308 | 6.4% |
District_Number
Real number (ℝ≥0)
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.97855211 |
|---|---|
| Minimum | 1 |
| Maximum | 75 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 9 |
| median | 17 |
| Q3 | 26 |
| 95-th percentile | 32 |
| Maximum | 75 |
| Range | 74 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 14.97024431 |
|---|---|
| Coefficient of variation (CV) | 0.7887980193 |
| Kurtosis | 5.570359878 |
| Mean | 18.97855211 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | 2.009067137 |
| Sum | 11451962 |
| Variance | 224.1082148 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 39354 | 6.5% |
| 10 | 31438 | 5.2% |
| 31 | 29853 | 4.9% |
| 27 | 26607 | 4.4% |
| 75 | 25365 | 4.2% |
| 11 | 23440 | 3.9% |
| 9 | 22973 | 3.8% |
| 24 | 22276 | 3.7% |
| 28 | 20618 | 3.4% |
| 30 | 20105 | 3.3% |
| Other values (23) | 341387 |
| Value | Count | Frequency (%) |
| 1 | 11120 | 1.8% |
| 2 | 39354 | |
| 3 | 16474 | |
| 4 | 12827 | 2.1% |
| 5 | 10645 | 1.8% |
| Value | Count | Frequency (%) |
| 75 | 25365 | |
| 32 | 9275 | 1.5% |
| 31 | 29853 | |
| 30 | 20105 | |
| 29 | 18368 |
Borough_Code
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 MiB |
| K | |
|---|---|
| Q | |
| X | |
| M | |
| R |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 603416 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | M |
| 3rd row | M |
| 4th row | M |
| 5th row | M |
| Value | Count | Frequency (%) |
| K | 178769 | |
| Q | 148530 | |
| X | 133381 | |
| M | 110824 | |
| R | 31912 | 5.3% |
| Value | Count | Frequency (%) |
| k | 178769 | |
| q | 148530 | |
| x | 133381 | |
| m | 110824 | |
| r | 31912 | 5.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| K | 178769 | |
| Q | 148530 | |
| X | 133381 | |
| M | 110824 | |
| R | 31912 | 5.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 603416 |
Most frequent character per category
| Value | Count | Frequency (%) |
| K | 178769 | |
| Q | 148530 | |
| X | 133381 | |
| M | 110824 | |
| R | 31912 | 5.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 603416 |
Most frequent character per script
| Value | Count | Frequency (%) |
| K | 178769 | |
| Q | 148530 | |
| X | 133381 | |
| M | 110824 | |
| R | 31912 | 5.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 603416 |
Most frequent character per block
| Value | Count | Frequency (%) |
| K | 178769 | |
| Q | 148530 | |
| X | 133381 | |
| M | 110824 | |
| R | 31912 | 5.3% |
School_Number
Real number (ℝ≥0)
| Distinct | 650 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 250.0509814 |
|---|---|
| Minimum | 1 |
| Maximum | 993 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 17 |
| Q1 | 94 |
| median | 205 |
| Q3 | 367 |
| 95-th percentile | 627 |
| Maximum | 993 |
| Range | 992 |
| Interquartile range (IQR) | 273 |
Descriptive statistics
| Standard deviation | 195.0448132 |
|---|---|
| Coefficient of variation (CV) | 0.7800201866 |
| Kurtosis | 0.3921586779 |
| Mean | 250.0509814 |
| Median Absolute Deviation (MAD) | 127 |
| Skewness | 0.931765877 |
| Sum | 150884763 |
| Variance | 38042.47914 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20 | 2599 | 0.4% |
| 48 | 2516 | 0.4% |
| 4 | 2447 | 0.4% |
| 11 | 2445 | 0.4% |
| 46 | 2396 | 0.4% |
| 75 | 2366 | 0.4% |
| 138 | 2362 | 0.4% |
| 9 | 2348 | 0.4% |
| 41 | 2260 | 0.4% |
| 36 | 2247 | 0.4% |
| Other values (640) | 579430 |
| Value | Count | Frequency (%) |
| 1 | 1853 | |
| 2 | 1476 | |
| 3 | 1875 | |
| 4 | 2447 | |
| 5 | 1924 |
| Value | Count | Frequency (%) |
| 993 | 797 | |
| 971 | 366 | |
| 964 | 535 | |
| 933 | 157 | < 0.1% |
| 907 | 14 | < 0.1% |
Borough_Name
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 9.2 MiB |
[Bar chart of category frequencies: Brooklyn, Queens, Bronx, Manhattan, Staten Island]
Length
| Max length | 13 |
|---|---|
| Median length | 8 |
| Mean length | 7.29266211 |
| Min length | 5 |
Characters and Unicode
| Total characters | 4400509 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Manhattan |
|---|---|
| 2nd row | Manhattan |
| 3rd row | Manhattan |
| 4th row | Manhattan |
| 5th row | Manhattan |
| Value | Count | Frequency (%) |
| Brooklyn | 178769 | |
| Queens | 148530 | |
| Bronx | 133381 | |
| Manhattan | 110824 | |
| Staten Island | 31912 | 5.3% |
| Value | Count | Frequency (%) |
| brooklyn | 178769 | |
| queens | 148530 | |
| bronx | 133381 | |
| manhattan | 110824 | |
| staten | 31912 | 5.0% |
| island | 31912 | 5.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 746152 | |
| o | 490919 | |
| a | 396296 | |
| e | 328972 | 7.5% |
| B | 312150 | 7.1% |
| r | 312150 | 7.1% |
| t | 285472 | 6.5% |
| l | 210681 | 4.8% |
| s | 180442 | 4.1% |
| k | 178769 | 4.1% |
| Other values (10) | 958506 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3733269 | |
| Uppercase Letter | 635328 | 14.4% |
| Space Separator | 31912 | 0.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 746152 | |
| o | 490919 | |
| a | 396296 | |
| e | 328972 | |
| r | 312150 | |
| t | 285472 | 7.6% |
| l | 210681 | 5.6% |
| s | 180442 | 4.8% |
| k | 178769 | 4.8% |
| y | 178769 | 4.8% |
| Other values (4) | 424647 |
| Value | Count | Frequency (%) |
| B | 312150 | |
| Q | 148530 | |
| M | 110824 | 17.4% |
| S | 31912 | 5.0% |
| I | 31912 | 5.0% |
| Value | Count | Frequency (%) |
| (space) | 31912 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4368597 | |
| Common | 31912 | 0.7% |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 746152 | |
| o | 490919 | |
| a | 396296 | |
| e | 328972 | 7.5% |
| B | 312150 | 7.1% |
| r | 312150 | 7.1% |
| t | 285472 | 6.5% |
| l | 210681 | 4.8% |
| s | 180442 | 4.1% |
| k | 178769 | 4.1% |
| Other values (9) | 926594 |
| Value | Count | Frequency (%) |
| (space) | 31912 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4400509 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 746152 | |
| o | 490919 | |
| a | 396296 | |
| e | 328972 | 7.5% |
| B | 312150 | 7.1% |
| r | 312150 | 7.1% |
| t | 285472 | 6.5% |
| l | 210681 | 4.8% |
| s | 180442 | 4.1% |
| k | 178769 | 4.1% |
| Other values (10) | 958506 |
% Attendance_2_yr_avg
Real number (ℝ≥0)
| Distinct | 1170 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 20151 |
| Missing (%) | 3.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 91.27576033 |
|---|---|
| Minimum | 1.7 |
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 1.7 |
|---|---|
| 5-th percentile | 81.55 |
| Q1 | 89.45 |
| median | 92.5 |
| Q3 | 94.55 |
| 95-th percentile | 96.75 |
| Maximum | 100 |
| Range | 98.3 |
| Interquartile range (IQR) | 5.1 |
Descriptive statistics
| Standard deviation | 5.04690642 |
|---|---|
| Coefficient of variation (CV) | 0.05529295403 |
| Kurtosis | 8.920475652 |
| Mean | 91.27576033 |
| Median Absolute Deviation (MAD) | 2.4 |
| Skewness | -2.154936517 |
| Sum | 53237956.35 |
| Variance | 25.47126442 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 93.6 | 4102 | 0.7% |
| 93.4 | 4065 | 0.7% |
| 93.9 | 4048 | 0.7% |
| 94 | 4032 | 0.7% |
| 93.1 | 3965 | 0.7% |
| 94.4 | 3947 | 0.7% |
| 94.6 | 3935 | 0.7% |
| 94.5 | 3933 | 0.7% |
| 93.5 | 3862 | 0.6% |
| 95.1 | 3738 | 0.6% |
| Other values (1160) | 543638 | |
| (Missing) | 20151 | 3.3% |
| Value | Count | Frequency (%) |
| 1.7 | 1 | < 0.1% |
| 16.4 | 4 | |
| 23.9 | 4 | |
| 25.1 | 1 | < 0.1% |
| 28.6 | 2 |
| Value | Count | Frequency (%) |
| 100 | 10 | |
| 99.85 | 6 | |
| 99.8 | 6 | |
| 99.75 | 4 | < 0.1% |
| 99.7 | 2 | < 0.1% |
% Chronically Absent_2_yr_avg
Real number (ℝ≥0)
| Distinct | 2674 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 20151 |
| Missing (%) | 3.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.00867402 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 7352 |
| Zeros (%) | 1.2% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4.5 |
| Q1 | 14.55 |
| median | 25.6 |
| Q3 | 39.35 |
| 95-th percentile | 59.2 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 24.8 |
Descriptive statistics
| Standard deviation | 17.0425317 |
|---|---|
| Coefficient of variation (CV) | 0.6084733499 |
| Kurtosis | -0.07234879048 |
| Mean | 28.00867402 |
| Median Absolute Deviation (MAD) | 12.1 |
| Skewness | 0.6118168871 |
| Sum | 16336479.25 |
| Variance | 290.4478869 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 7352 | 1.2% |
| 50 | 3176 | 0.5% |
| 25 | 2841 | 0.5% |
| 33.3 | 2556 | 0.4% |
| 12.5 | 2036 | 0.3% |
| 20 | 1919 | 0.3% |
| 14.3 | 1888 | 0.3% |
| 16.7 | 1842 | 0.3% |
| 28.6 | 1797 | 0.3% |
| 40 | 1561 | 0.3% |
| Other values (2664) | 556297 | |
| (Missing) | 20151 | 3.3% |
| Value | Count | Frequency (%) |
| 0 | 7352 | |
| 0.15 | 6 | < 0.1% |
| 0.2 | 6 | < 0.1% |
| 0.25 | 6 | < 0.1% |
| 0.3 | 11 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 73 | |
| 98.5 | 4 | < 0.1% |
| 97.35 | 4 | < 0.1% |
| 96.8 | 3 | < 0.1% |
| 96.6 | 1 | < 0.1% |
% Attendance_5_yr_avg
Real number (ℝ≥0)
| Distinct | 6297 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 2506 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 91.3265413 |
|---|---|
| Minimum | 0.7 |
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 0.7 |
|---|---|
| 5-th percentile | 81.46 |
| Q1 | 89.55 |
| median | 92.54 |
| Q3 | 94.6 |
| 95-th percentile | 96.72 |
| Maximum | 100 |
| Range | 99.3 |
| Interquartile range (IQR) | 5.05 |
Descriptive statistics
| Standard deviation | 5.021769394 |
|---|---|
| Coefficient of variation (CV) | 0.0549869657 |
| Kurtosis | 10.31706588 |
| Mean | 91.3265413 |
| Median Absolute Deviation (MAD) | 2.36 |
| Skewness | -2.223978674 |
| Sum | 54879031.93 |
| Variance | 25.21816784 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 94.6 | 1678 | 0.3% |
| 94.1 | 1627 | 0.3% |
| 94.5 | 1569 | 0.3% |
| 94.4 | 1534 | 0.3% |
| 94.9 | 1530 | 0.3% |
| 94.7 | 1500 | 0.2% |
| 93.9 | 1470 | 0.2% |
| 94.3 | 1460 | 0.2% |
| 93.5 | 1455 | 0.2% |
| 92.5 | 1448 | 0.2% |
| Other values (6287) | 585639 | |
| (Missing) | 2506 | 0.4% |
| Value | Count | Frequency (%) |
| 0.7 | 1 | |
| 1.7 | 1 | |
| 13.2 | 1 | |
| 13.8 | 1 | |
| 16 | 1 |
| Value | Count | Frequency (%) |
| 100 | 4 | |
| 99.7 | 8 | |
| 99.6 | 3 | < 0.1% |
| 99.55 | 3 | < 0.1% |
| 99.5 | 2 | < 0.1% |
% Chronically Absent_5_yr_avg
Real number (ℝ≥0)
| Distinct | 13754 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 2506 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.45054814 |
|---|---|
| Minimum | 0 |
| Maximum | 100 |
| Zeros | 3449 |
| Zeros (%) | 0.6% |
| Memory size | 9.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4.825 |
| Q1 | 14.3 |
| median | 25.18 |
| Q3 | 38.35 |
| 95-th percentile | 57.8 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 24.05 |
Descriptive statistics
| Standard deviation | 16.51960436 |
|---|---|
| Coefficient of variation (CV) | 0.6017950635 |
| Kurtosis | -0.05128493758 |
| Mean | 27.45054814 |
| Median Absolute Deviation (MAD) | 11.78 |
| Skewness | 0.6204422603 |
| Sum | 16495308.88 |
| Variance | 272.8973282 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 3449 | 0.6% |
| 33.3 | 904 | 0.1% |
| 50 | 902 | 0.1% |
| 16.7 | 891 | 0.1% |
| 14.3 | 873 | 0.1% |
| 25 | 789 | 0.1% |
| 12.5 | 667 | 0.1% |
| 28.6 | 633 | 0.1% |
| 11.1 | 570 | 0.1% |
| 22.2 | 516 | 0.1% |
| Other values (13744) | 590716 | |
| (Missing) | 2506 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 3449 | |
| 0.22 | 6 | < 0.1% |
| 0.24 | 6 | < 0.1% |
| 0.25 | 10 | < 0.1% |
| 0.26 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 100 | 112 | |
| 99.5 | 2 | < 0.1% |
| 99.46666667 | 3 | < 0.1% |
| 99.25 | 6 | < 0.1% |
| 99.16666667 | 3 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
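The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report uses; `pearson_r` is a hypothetical helper name, and the sample values are the % Attendance and % Chronically Absent figures from the first five rows shown under "First rows".

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

attendance = [92.0, 88.1, 90.7, 92.2, 92.3]
chronically_absent = [26.9, 53.3, 29.5, 31.0, 20.0]
r = pearson_r(attendance, chronically_absent)

# Shifting and rescaling one variable leaves r unchanged
r_rescaled = pearson_r([10 * x + 3 for x in attendance], chronically_absent)
```

On these five rows r is negative, as expected: higher attendance goes with a lower share of chronically absent students.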
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
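A minimal sketch of that definition: rank both variables, then apply the Pearson formula to the ranks. Assigning tied values the average of their positions is a common convention and an assumption here; `average_ranks` and `spearman_rho` are hypothetical helper names.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# A monotonic but nonlinear relationship (y = x cubed)
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
```

Because the ranks of x and x cubed are identical, ρ reaches its maximum here even though the relationship is not linear.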
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
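The pair-counting formula above corresponds to the variant usually called tau-a. A minimal sketch under that assumption follows; statistical libraries often default to a tie-adjusted variant (tau-b), so results may differ when the data contain ties. `kendall_tau_a` is a hypothetical helper name.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Three pairs in total: two concordant, one discordant
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This direct version compares every pair, so its cost grows quadratically with the number of observations; it is meant to illustrate the definition rather than to scale to a dataset of this size.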
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | DBN | School Name | Grade | Year | Demographic Variable | # Days Absent | # Days Present | % Attendance | % Chronically Absent | Next Year % Chronically Absent | Chronically_Absent_Next_Year | District_Number | Borough_Code | School_Number | Borough_Name | % Attendance_2_yr_avg | % Chronically Absent_2_yr_avg | % Attendance_5_yr_avg | % Chronically Absent_5_yr_avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 01M015 | P.S. 015 Roberto Clemente | All Grades | 2013-14 | All Students | 2783.0 | 32020.0 | 92.0 | 26.9 | 23.4 | Medium | 01 | M | 015 | Manhattan | 93.65 | 21.95 | 93.060 | 24.320 |
| 1 | 01M015 | P.S. 015 Roberto Clemente | PK in K-12 Schools | 2013-14 | All Students | 560.0 | 4151.0 | 88.1 | 53.3 | 65.2 | High | 01 | M | 015 | Manhattan | 92.60 | 22.20 | 88.775 | 47.675 |
| 2 | 01M015 | P.S. 015 Roberto Clemente | 0K | 2013-14 | All Students | 659.0 | 6414.0 | 90.7 | 29.5 | 30.0 | Medium | 01 | M | 015 | Manhattan | 92.55 | 30.45 | 91.720 | 30.560 |
| 3 | 01M015 | P.S. 015 Roberto Clemente | 1 | 2013-14 | All Students | 525.0 | 6214.0 | 92.2 | 31.0 | 18.8 | Medium | 01 | M | 015 | Manhattan | 93.95 | 16.45 | 93.160 | 25.640 |
| 4 | 01M015 | P.S. 015 Roberto Clemente | 2 | 2013-14 | All Students | 308.0 | 3680.0 | 92.3 | 20.0 | 21.9 | Medium | 01 | M | 015 | Manhattan | 93.75 | 20.60 | 93.620 | 19.540 |
| 5 | 01M015 | P.S. 015 Roberto Clemente | 3 | 2013-14 | All Students | 239.0 | 3084.0 | 92.8 | 20.0 | 10.5 | Medium | 01 | M | 015 | Manhattan | 94.30 | 20.70 | 93.800 | 18.380 |
| 6 | 01M015 | P.S. 015 Roberto Clemente | 4 | 2013-14 | All Students | 301.0 | 4390.0 | 93.6 | 17.2 | 11.1 | Medium | 01 | M | 015 | Manhattan | 93.65 | 20.55 | 94.200 | 16.240 |
| 7 | 01M015 | P.S. 015 Roberto Clemente | 5 | 2013-14 | All Students | 191.0 | 4087.0 | 95.5 | 7.7 | 7.4 | Low | 01 | M | 015 | Manhattan | 95.00 | 10.20 | 95.160 | 10.440 |
| 8 | 01M019 | P.S. 019 Asher Levy | All Grades | 2013-14 | All Students | 4070.0 | 46883.0 | 92.0 | 26.4 | 30.5 | Medium | 01 | M | 019 | Manhattan | 90.95 | 34.10 | 91.360 | 31.940 |
| 9 | 01M019 | P.S. 019 Asher Levy | PK in K-12 Schools | 2013-14 | All Students | 664.0 | 5498.0 | 89.2 | 37.8 | 32.3 | Medium | 01 | M | 019 | Manhattan | 88.50 | 33.60 | 89.140 | 39.640 |
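In the rows above, District_Number, Borough_Code and School_Number match the three segments of the DBN (for example `01M015` gives `01`, `M`, `015`). A minimal sketch of how such columns could be derived, assuming a fixed six-character DBN; `split_dbn` is a hypothetical helper, and the code-to-name mapping is inferred from the matching value counts of Borough_Code and Borough_Name in this report rather than stated by it:

```python
# Inferred mapping (assumption): each code has the same count as the paired name
BOROUGH_NAMES = {
    "K": "Brooklyn",
    "Q": "Queens",
    "X": "Bronx",
    "M": "Manhattan",
    "R": "Staten Island",
}

def split_dbn(dbn):
    """Split a DBN such as '01M015' into district, borough and school parts."""
    district, borough_code, school = dbn[:2], dbn[2], dbn[3:]
    return {
        "District_Number": district,
        "Borough_Code": borough_code,
        "School_Number": school,
        "Borough_Name": BOROUGH_NAMES[borough_code],
    }

parts = split_dbn("01M015")
```

The parts are kept as strings so that leading zeros survive. The report profiles School_Number as a real number, which suggests a numeric conversion happened later.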
Last rows
| | DBN | School Name | Grade | Year | Demographic Variable | # Days Absent | # Days Present | % Attendance | % Chronically Absent | Next Year % Chronically Absent | Chronically_Absent_Next_Year | District_Number | Borough_Code | School_Number | Borough_Name | % Attendance_2_yr_avg | % Chronically Absent_2_yr_avg | % Attendance_5_yr_avg | % Chronically Absent_5_yr_avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 603406 | 75X811 | P.S. X811 | All Grades | 2018-19 | ELL | 7728.0 | 34631.0 | 81.8 | 62.7 | NaN | NaN | 75 | X | 811 | Bronx | 81.95 | 63.90 | 83.540000 | 59.660000 |
| 603407 | 75X811 | P.S. X811 | All Grades | 2018-19 | Not ELL | 11025.0 | 58922.0 | 84.2 | 51.7 | NaN | NaN | 75 | X | 811 | Bronx | 84.15 | 52.15 | 83.980000 | 53.420000 |
| 603408 | 75X811 | P.S. X811 | 9 | 2018-19 | ELL | 598.0 | 3203.0 | 84.3 | 62.5 | NaN | NaN | 75 | X | 811 | Bronx | 82.80 | 60.40 | 84.925000 | 51.900000 |
| 603409 | 75X811 | P.S. X811 | 9 | 2018-19 | Not ELL | 1848.0 | 8428.0 | 82.0 | 60.9 | NaN | NaN | 75 | X | 811 | Bronx | 84.90 | 50.00 | 85.125000 | 50.400000 |
| 603410 | 75X811 | P.S. X811 | 10 | 2018-19 | ELL | 1222.0 | 7116.0 | 85.3 | 49.0 | NaN | NaN | 75 | X | 811 | Bronx | 79.00 | 68.40 | 82.733333 | 56.266667 |
| 603411 | 75X811 | P.S. X811 | 10 | 2018-19 | Not ELL | 1899.0 | 10105.0 | 84.2 | 52.8 | NaN | NaN | 75 | X | 811 | Bronx | 85.90 | 45.20 | 85.333333 | 48.733333 |
| 603412 | 75X811 | P.S. X811 | 11 | 2018-19 | ELL | 1071.0 | 4239.0 | 79.8 | 66.7 | NaN | NaN | 75 | X | 811 | Bronx | 84.90 | 61.90 | 86.600000 | 53.666667 |
| 603413 | 75X811 | P.S. X811 | 11 | 2018-19 | Not ELL | 1359.0 | 8382.0 | 86.0 | 55.2 | NaN | NaN | 75 | X | 811 | Bronx | 82.90 | 53.50 | 83.033333 | 51.300000 |
| 603414 | 75X811 | P.S. X811 | 12 | 2018-19 | ELL | 4837.0 | 20073.0 | 80.6 | 66.7 | NaN | NaN | 75 | X | 811 | Bronx | 82.55 | 62.65 | 82.880000 | 63.720000 |
| 603415 | 75X811 | P.S. X811 | 12 | 2018-19 | Not ELL | 5849.0 | 31639.0 | 84.4 | 47.4 | NaN | NaN | 75 | X | 811 | Bronx | 83.10 | 53.35 | 83.380000 | 55.440000 |